CORE-17360: Add topics for mapper scheduled cleanup and update cleanup Avro records #1265
Problem statement
As part of ongoing performance work, there is a requirement to move the storage of flow mapper states to a shared state store. This will break the current implementation of mapper state cleanup. Flow mapper states are currently cleaned up using a scheduler implemented in each flow mapper, which relies on states being partitioned across the set of mapper instances. Once the states live in a shared state store, that partitioning will no longer apply.
This problem can be addressed using the new task scheduler implemented in the DB worker, but doing so will require a few new topics for the mapper to listen to.
Solution
Add two new topics. The first is the topic that scheduled tasks for the flow mapper will be written to. The second is a dedicated cleanup event topic that will be written to as part of processing the scheduled event.
In addition, the ExecuteCleanup event has been modified to take a list of IDs as a field. This will allow the processor of the scheduled task to batch up the cleanup requests being sent to the mapper workers.
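To illustrate the batching this enables, here is a minimal sketch in Python. It is not the code in this PR: the ExecuteCleanup class below is a hypothetical stand-in for the Avro record (only the list-of-IDs field is taken from the description above), and batch_cleanup_events and batch_size are assumed names introduced for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class ExecuteCleanup:
    """Hypothetical stand-in for the Avro record; only the ID list is from the PR."""
    ids: List[str]


def batch_cleanup_events(
    expired_ids: Iterable[str], batch_size: int
) -> Iterator[ExecuteCleanup]:
    """Group expired mapper state IDs into ExecuteCleanup events.

    Each event carries at most batch_size IDs, so one scheduled task can
    produce a few cleanup events instead of one event per state.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for state_id in expired_ids:
        batch.append(state_id)
        if len(batch) == batch_size:
            yield ExecuteCleanup(ids=batch)
            batch = []  # start a fresh list so emitted events are not mutated
    if batch:
        # Emit the final, possibly smaller, batch.
        yield ExecuteCleanup(ids=batch)
```

In this sketch the processor of the scheduled task would publish each yielded event to the cleanup event topic. How the expired IDs are looked up and how the batch size is chosen are left open here.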